Abstract

In the field of education, most of the collected data have a multi-level structure. Cities, schools
within cities, classes within schools and, finally, students within classrooms constitute a hierarchical
structure. Hierarchical linear models give more accurate results than standard models when the data set
has a structure that extends from individuals to groups of individuals and to communities of groups. In
this study, the effects of school-level indicators on the overall reading skills performance of
15-year-old students in the PISA 2015 Turkey application are analyzed using a two-level hierarchical
linear model. In addition to the socioeconomic indicators of the family, the education level of the
parents and some student-level indicators regarding reading skills, school-level indicators such as
school type, number of students and number of teachers are integrated into the modelling process to
reflect the hierarchical data structure in the statistical model. At the end of the modelling process,
the factors that affect reading skills are determined.
1. Introduction

The Programme for International Student Assessment (PISA) is a survey conducted every three years to
evaluate the knowledge and skills acquired by fifteen-year-olds in the leading industrialized
countries. The survey is the outcome of collaboration among the governments of the member countries of
the Organisation for Economic Co-operation and Development (OECD), and it draws on international
expertise to make valid comparisons between countries and cultures. The survey aims to determine the
extent to which fifteen-year-old students are prepared to overcome the future problems of an
information society. It also assesses the students’ understanding of the complex reading materials
they see in their daily lives. Another PISA goal is to see how well fifteen-year-old students can put
into practice what they learn in school mathematics and science in a world that is increasingly based
on technology and scientific development, revealing the strength of the knowledge and skills that
students need to participate in social life effectively. The PISA survey also determines the influence
on performance of learning motivation, interest in courses and preferred learning style as observed in
students (MEB-EARGED, 2005: 1-2).

The main focus of PISA surveys is to assess students’ skills in reading, mathematical and scientific
literacy. Within the framework of the project, each survey period concentrates on one of these subject
areas, although the other two areas are included in the assessment as well. In other words, each
subject area is the primary focus once within a nine-year cycle. The first PISA survey was conducted
in 2000 and focused on reading skills as the main area. The 2003 survey focused on mathematical
literacy, and the 2006 survey on scientific literacy. In 2009, a new cycle began with reading skills
as the dominant subject area (MEB-EARGED, 2010: 2). The 2015 implementation was the sixth PISA survey,
and its dominant subject area was scientific literacy, with fewer questions related to reading skills
(Taş et al., 2016: 22).

Reading skills concern students’ ability to use written information in real situations (Sarıçoban and
Alyas, 2012). In the PISA survey, reading skills are described as “a person’s comprehending, using,
thinking of and engaging with written materials with the purpose of participating in the society,
improving their potential and knowledge, and achieving their goals.” This description goes beyond the
traditional concept of analyzing information and comprehending a written text. The concept of reading
skills in PISA covers the range of situations in which people read and the ways in which written texts
are presented (e.g., printed books, memos, online boards and news). It also covers readers’ different
ways of approaching and using texts, from the functional and limited (e.g., finding specific practical
information) to the deep and comprehensive (e.g., understanding other ways of thinking and being). The
PISA reading skills evaluation has three dimensions: the text, the reader’s approach to the text and
the text’s purpose, although these three dimensions cannot be strictly separated because they are
interdependent (Taş et al., 2016: 22).

In PISA 2015, fifteen-year-old students were asked 103 questions to evaluate their capabilities in 
the reading skills subject area. Students’ competence levels are determined based on their responses to 
survey questions. 

A review of the reading skills literature reveals a variety of statistical methods used to determine
the factors that influence achievement. This study starts from the observation that most of the data
collected in educational research have a multi-level structure, which calls for advanced statistical
methods. To determine the factors that influence student achievement in PISA 2009, Gürsakal (2012)
used logistic regression analysis, while Özdemir and Gelbal (2014) used canonical commonality
analysis. Among studies of reading skills, Gülleroğlu, Bilican Demir and Demirtaşlı (2014) analyzed
the 2003, 2006 and 2009 PISA data using stepwise multiple regression analysis and identified the
independent variables that best predict achievement in reading skills. Similarly, Avşar and Yalçın
(2015) examined achievement in reading and used chi-squared automatic interaction detection (CHAID)
to evaluate the family-related variables presumed to affect achievement.

In contrast, this study uses a two-level hierarchical linear model to reveal the factors that
influence reading skills in PISA 2015. Hierarchical linear models are designed for data with a
multi-level structure and are commonly used in statistical analyses. They are useful because they
account for the clustering of individuals with similar characteristics in the same groups. The data
that researchers encounter in education, health, economics and the social sciences mostly have a
multi-level structure. In the field of education, the hierarchical structure consists of provinces,
schools within the provinces, classes within the schools and, finally, the students in the classes.
2. Method

A total of 72 countries, including 35 OECD countries, participated in PISA 2015. In this study, the
effects of school-level variables on the overall reading skills achievement of 15-year-old students in
Turkey were analyzed using a two-level hierarchical linear model. Gender, the economic, social and
cultural index of the family, the education level of the parents and student-based reading skills are
set as student-level variables, whereas the school type, number of students and number of teachers are
set as school-level variables. Both groups of variables are introduced into the modeling process in
order to reflect the hierarchical data structure in the statistical model.

When the data have a multi-level structure consisting of individuals, groups of individuals and
communities of groups, hierarchical linear models give better results than standard models (Bryk and
Raudenbush, 2002).

2.1. Population and Sample 

In Turkey, there are 1324089 fifteen-year-old students in education; the accessible population,
however, was determined as 925366 students. For PISA 2015, schools were selected by stratified random
sampling, and at the end of the sampling process 187 schools were selected for the survey. Among the
fifteen-year-old students in the selected schools, 5895 were chosen by simple random sampling and
participated in the PISA 2015 Turkey application.
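The two-stage selection described above (schools drawn within strata, then students drawn within each selected school) can be sketched as follows. This is only an illustration under stated assumptions: the strata, school names, roster sizes and the equal-probability draw within each stratum are hypothetical and are not taken from the PISA sampling design.

```python
import random


def two_stage_sample(schools_by_stratum, schools_per_stratum, students_per_school, seed=0):
    """Draw schools at random within each stratum, then students at random within each school."""
    rng = random.Random(seed)
    sample = {}
    for stratum, schools in schools_by_stratum.items():
        # Stage 1: simple random sample of schools inside this stratum (assumed equal probability).
        chosen = rng.sample(sorted(schools), min(schools_per_stratum, len(schools)))
        for school in chosen:
            roster = schools[school]
            # Stage 2: simple random sample of students inside the selected school.
            sample[school] = rng.sample(roster, min(students_per_school, len(roster)))
    return sample


# Hypothetical population: two strata with made-up schools and student identifiers.
toy_population = {
    "stratum_a": {f"A{s}": [f"A{s}-{i}" for i in range(30)] for s in range(5)},
    "stratum_b": {f"B{s}": [f"B{s}-{i}" for i in range(20)] for s in range(4)},
}
toy_sample = two_stage_sample(toy_population, schools_per_stratum=2, students_per_school=10)
```

Because students are drawn inside already selected schools, students from the same school share a school-level context, which is the source of the hierarchical structure modeled in this study.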

In the Turkey sample of PISA 2015, girls and boys are almost equally represented: there are 2938
girls and 2957 boys. The distribution of students by school type is more heterogeneous than that of
the other school-level variables. There are 121 students in basic education, 3241 in general secondary
education and 2533 in vocational and technical education, so the share of students in basic education
is small, approximately 2%. Unlike the previous PISA surveys, the 2015 Turkey sample also includes
private schools, with a proportion of 4%. Of the 187 schools, 7 are private and 179 are public.

2.2. Data Collection Method

In PISA, data are collected using two separate surveys: a student-based survey and a school-based
survey. In the modeling process, the answers to the student-based survey are set as first-level data,
whereas the answers to the school-based survey are set as second-level data. The first-level variables
are the reading skill value, the economic, social and cultural index and the education level of each
parent. The second-level variables are school type (basic / general secondary / vocational and
technical), region, kind of school (private/public) and the number of teachers per student.

“The test design for PISA is based on a variant of matrix sampling where each student was 
administered a subset of items from the total item pool. To increase the accuracy of the measurement, 
PISA uses plausible values which are multiple imputations drawn from a posteriori distribution by 
combining the IRT scaling of the test items with a latent regression model using information from the 
student context questionnaire in a population model” (OECD, 2016). 

In the two-level hierarchical linear model used to determine the factors affecting achievement in
reading skills, the reading skills indicator is the dependent variable. This indicator was included in
the modeling process by averaging the plausible values provided in PISA 2015. The plausible values for
reading skills are named PV1READ, PV2READ, PV3READ, PV4READ, PV5READ, PV6READ, PV7READ, PV8READ,
PV9READ and PV10READ.
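The averaging step can be illustrated with a minimal sketch. The ten column names are those listed in the text; the record layout (one dictionary per student) and the numeric values are assumptions made only for the example.

```python
from statistics import mean

# Plausible-value column names as listed in the text.
PV_COLUMNS = [f"PV{i}READ" for i in range(1, 11)]


def reading_score(student):
    """Return the mean of the ten reading plausible values for one student record."""
    return mean(student[col] for col in PV_COLUMNS)


# Hypothetical student record with made-up plausible values (400, 405, ..., 445).
example_student = {col: 400.0 + 5.0 * i for i, col in enumerate(PV_COLUMNS)}
example_score = reading_score(example_student)
```

The resulting single score per student is what enters the model as the dependent variable in this study.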
2.3. Data Analysis

The data are analyzed with hierarchical modeling, which is a multi-level modeling technique. Samples
obtained from a hierarchically structured population are called multi-level samples. It is assumed
that sub-groups are first drawn from groups and that units are then drawn from within these
sub-groups. In this data structure, the units within a group tend to be more similar to each other
than units drawn by simple random sampling from the whole population. The reason for this resemblance
is that the observations within a group are correlated with each other: common environmental factors
and similar demographic conditions create dependency between the units in the same group. In this
case, the independence assumption required by standard models is not satisfied. Hierarchical linear
models can be used as a solution when the observation units are correlated with each other because of
the hierarchical structure between them.
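One common way to quantify this within-group dependency is the intraclass correlation (ICC), which the study itself does not report. The sketch below uses the one-way ANOVA estimator and, as a simplifying assumption, requires equally sized groups; the scores are made up.

```python
from statistics import mean


def intraclass_correlation(groups):
    """One-way ANOVA estimate of the ICC for equally sized groups."""
    k = len(groups)
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes equally sized groups")
    grand_mean = mean(x for g in groups for x in g)
    group_means = [mean(g) for g in groups]
    # Mean square between groups and mean square within groups.
    ms_between = n * sum((gm - grand_mean) ** 2 for gm in group_means) / (k - 1)
    ms_within = sum(
        (x - gm) ** 2 for g, gm in zip(groups, group_means) for x in g
    ) / (k * (n - 1))
    return (ms_between - ms_within) / (ms_between + (n - 1) * ms_within)


# Made-up reading scores for three hypothetical schools.
toy_schools = [[410.0, 420.0, 430.0], [500.0, 510.0, 520.0], [450.0, 455.0, 460.0]]
icc = intraclass_correlation(toy_schools)
```

An ICC close to 1 means that most of the variation lies between groups, so treating students as independent observations would be a poor approximation.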


3. Hierarchical Linear Models 

Multi-level modeling techniques were first introduced by Aitkin and Longford (1986) and, in the
following years, owing to the development of computer programs, they have been used in many areas such
as psychology, sociology, demography and econometrics, and mainly in the field of education. In the
early 1990s, Bryk and Raudenbush (1992) developed the hierarchical linear modeling technique as one of
the multi-level modeling techniques.

Conceptually, hierarchical linear models are a hierarchical system of regression equations. Using
standard regression models in the presence of the dependency that results from the hierarchical data
structure will cause the estimates to be biased and misleading. In this case, instead of standard
regression models, it is appropriate to use multi-level techniques, especially hierarchical models.

In the hierarchical modeling technique, a statistical model is established for each stage of the
hierarchical structure, and these models are then combined to reach the final model. In multi-level
models, the combined model has a very complex structure and is hard to interpret. Therefore, the
sub-models belonging to the individual stages are used in interpretation. Hierarchical linear models
can have two or more levels. Although three- or four-level hierarchical structures are encountered in
applications, a two-level hierarchical linear model is used most often.

In a two-level model, the first level is the individual level and the second level belongs to the
groups from which the first-level units come. Group sizes are not required to be equal.

Let $Y_{ij}$ be the response variable for the $i$th unit of the $j$th group. The first level model can
be written as:

\[
Y_{ij} = \beta_{0j} + \beta_{1j}X_{1ij} + \beta_{2j}X_{2ij} + \cdots + \beta_{mj}X_{mij} + r_{ij}
       = \beta_{0j} + \sum_{q=1}^{m}\beta_{qj}X_{qij} + r_{ij} \qquad (1)
\]

where $k$ is the number of groups and $n_j$ is the size of the $j$th group ($i = 1, 2, \ldots, n_j$;
$j = 1, 2, \ldots, k$). In Equation 1, $\beta_{0j}$ is the constant term, whereas $\beta_{1j},
\beta_{2j}, \ldots, \beta_{mj}$ represent the $m$ slope parameters for the first level ($q = 1, 2,
\ldots, m$). $X_{1ij}, X_{2ij}, \ldots, X_{mij}$ are the explanatory variable measurements for the
$i$th unit of the $j$th group. $r_{ij}$ is the error term, assumed to be normally distributed with
mean 0 and variance $\sigma^2$, $r_{ij} \sim N(0, \sigma^2)$.

The first level model given in Equation 1 is similar to the standard regression model. However, in
hierarchical linear modeling, unlike classical regression analysis, the values of the coefficients
$\beta_{0j}, \beta_{1j}, \ldots, \beta_{mj}$ of the main model are estimated with the help of
second-level sub-models. The estimated values are substituted into the main model to obtain the
predicted values of $Y_{ij}$. The second-level sub-models can be written as:

\[
\begin{aligned}
\beta_{0j} &= \gamma_{00} + \gamma_{01}W_{1j} + \gamma_{02}W_{2j} + \cdots + \gamma_{0n}W_{nj} + u_{0j} \\
\beta_{1j} &= \gamma_{10} + \gamma_{11}W_{1j} + \gamma_{12}W_{2j} + \cdots + \gamma_{1n}W_{nj} + u_{1j} \\
&\;\;\vdots \\
\beta_{mj} &= \gamma_{m0} + \gamma_{m1}W_{1j} + \gamma_{m2}W_{2j} + \cdots + \gamma_{mn}W_{nj} + u_{mj}
\end{aligned}
\qquad (2)
\]

where $n$ is the number of second-level independent variables ($p = 1, 2, \ldots, n$). Similar to the
representation in the main model, $\gamma_{00}, \gamma_{10}, \ldots, \gamma_{m0}$ are the $(m+1)$
constant terms and $\gamma_{01}, \gamma_{02}, \ldots, \gamma_{0n}, \gamma_{11}, \gamma_{12}, \ldots,
\gamma_{1n}, \ldots, \gamma_{m1}, \gamma_{m2}, \ldots, \gamma_{mn}$ are the slope parameters of the
sub-models. $W_{1j}, W_{2j}, \ldots, W_{nj}$ represent the values of the $n$ independent variables.
Finally, the last components of each model, $u_{0j}, u_{1j}, \ldots, u_{mj}$, are the error terms,
which are normally distributed with mean 0 and variances $\tau_{00}, \tau_{11}, \ldots, \tau_{mm}$,
respectively.

The combined model, which has a complex structure, can be written as in Equation 3 by substituting the
sub-models (2) into the first level model (1):

\[
Y_{ij} = \gamma_{00} + \sum_{q=1}^{m}\gamma_{q0}X_{qij} + \sum_{p=1}^{n}\gamma_{0p}W_{pj}
       + \sum_{p=1}^{n}\sum_{q=1}^{m}\gamma_{qp}W_{pj}X_{qij}
       + u_{0j} + \sum_{q=1}^{m}u_{qj}X_{qij} + r_{ij} \qquad (3)
\]
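The substitution that leads from the first-level model (1) and the second-level sub-models (2) to the combined model (3) can be checked numerically. The sketch below draws arbitrary values for one unit i in one group j and compares the nested computation with the combined formula; the dimensions m = 2 and n = 2 and all numeric values are arbitrary choices for illustration.

```python
import random

random.seed(0)
m, n = 2, 2  # numbers of level-1 and level-2 predictors (arbitrary choice)

# Hypothetical fixed effects gamma[q][p]: q = 0..m indexes the level-1 coefficient,
# p = 0 is the sub-model constant and p = 1..n are the sub-model slopes.
gamma = [[random.uniform(-1, 1) for _ in range(n + 1)] for _ in range(m + 1)]
w = [random.uniform(-1, 1) for _ in range(n)]    # W_1j .. W_nj for one group j
x = [random.uniform(-1, 1) for _ in range(m)]    # X_1ij .. X_mij for one unit i
u = [random.gauss(0, 1) for _ in range(m + 1)]   # u_0j .. u_mj
r = random.gauss(0, 1)                           # r_ij

# Equation 2: level-1 coefficients from the level-2 sub-models.
beta = [
    gamma[q][0] + sum(gamma[q][p + 1] * w[p] for p in range(n)) + u[q]
    for q in range(m + 1)
]
# Equation 1: response from the level-1 model.
y_nested = beta[0] + sum(beta[q + 1] * x[q] for q in range(m)) + r

# Equation 3: combined model written out term by term.
y_combined = (
    gamma[0][0]
    + sum(gamma[q + 1][0] * x[q] for q in range(m))
    + sum(gamma[0][p + 1] * w[p] for p in range(n))
    + sum(gamma[q + 1][p + 1] * w[p] * x[q] for p in range(n) for q in range(m))
    + u[0]
    + sum(u[q + 1] * x[q] for q in range(m))
    + r
)
```

The two values agree up to floating-point rounding, which shows that the cross-level interaction terms and the composite error in Equation 3 arise purely from the substitution.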


Because the combined model is complex and difficult to interpret, the sub-models are used, for
convenience, in calculations and interpretations. The most appropriate model for the grouped data is
obtained by comparing sub-models step by step. The least squares estimation method is used in the
parameter estimation process of the two-level hierarchical linear model (Bryk and Raudenbush, 2002).
4. Application and Results

In this study, the effects of the student- and school-level variables that are considered to affect
average reading skills were analyzed with a two-level hierarchical linear model within the scope of
the PISA 2015 Turkey application. The analysis was carried out with HLM7, the seventh version of the
HLM program developed for hierarchical modeling by Raudenbush, Bryk et al. (2011). In the modeling
phase, observations with missing values were excluded from the analysis, and a total of 5708 students
from 176 schools were modeled. The variables used in the modeling process are given in Table 1.


Table 1. Variables used in modeling process

Dependent        Student-Level (Level 1)                  School-Level (Level 2)
Variable         Independent Variables                    Independent Variables

Reading skill    Economic social cultural index (ESCS)    School type
                 Mother’s education level (MEL)           Kind of school
                 Father’s education level (FEL)           Region
                                                          Teacher per student
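As an illustration only, the variables in Table 1 can be placed into the two-level notation of Section 3. The specification below is an assumption, since the source does not state which coefficients were allowed to vary across schools: it supposes that the school-level variables enter through the intercept and that the student-level slopes are fixed. READ denotes the averaged reading score.

```latex
\begin{aligned}
\text{READ}_{ij} &= \beta_{0j} + \beta_{1j}\,\text{ESCS}_{ij} + \beta_{2j}\,\text{MEL}_{ij}
                  + \beta_{3j}\,\text{FEL}_{ij} + r_{ij} \\
\beta_{0j} &= \gamma_{00} + \gamma_{01}\,\text{Type}_{j} + \gamma_{02}\,\text{Kind}_{j}
            + \gamma_{03}\,\text{Teacher}_{j} + \gamma_{04}\,\text{Region}_{j} + u_{0j} \\
\beta_{qj} &= \gamma_{q0}, \qquad q = 1, 2, 3
\end{aligned}
```

Under this assumed form there is one fixed coefficient per student-level variable and one per school-level variable.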


The estimated effects of the variables analyzed in the study and their statistical significance are
summarized in Table 2:


Table 2. Estimation of the student level effects and school level effects

Level      Variables    Coefficient    Standard Error    p-value

Level 1    ESCS              25.167             8.287     0.002*
           MEL               -0.028             3.736     0.994
           FEL              -11.693             4.376     0.008*

Level 2    Type             -20.861             6.989     0.003*
           Kind               6.440            20.612     0.755
           Teacher            0.053             0.119     0.654
           Region            -6.184             1.129    <0.001*

*: significant at α=0.05
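The significance decisions in Table 2 can be approximately reproduced from the coefficients and standard errors. The sketch below divides each coefficient by its standard error and uses a two-sided standard normal approximation; this is an assumption, since HLM7 reports t-based p-values with its own degrees of freedom, so small differences from the reported p-values are expected.

```python
import math

# (coefficient, standard error, reported p-value) taken from Table 2;
# None marks the entry reported only as "<0.001".
table2 = {
    "ESCS": (25.167, 8.287, 0.002),
    "MEL": (-0.028, 3.736, 0.994),
    "FEL": (-11.693, 4.376, 0.008),
    "Type": (-20.861, 6.989, 0.003),
    "Kind": (6.440, 20.612, 0.755),
    "Teacher": (0.053, 0.119, 0.654),
    "Region": (-6.184, 1.129, None),
}


def two_sided_p(coefficient, standard_error):
    """Two-sided p-value of coefficient / standard_error under a standard normal approximation."""
    z = coefficient / standard_error
    return math.erfc(abs(z) / math.sqrt(2))


approx_p = {name: two_sided_p(c, se) for name, (c, se, _) in table2.items()}
significant = [name for name, p in approx_p.items() if p < 0.05]
```

With this approximation the same four variables (ESCS, FEL, Type and Region) fall below the 0.05 level as in Table 2.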


5. Conclusion 

Based on the results of the two-level hierarchical model given in Table 2, it can be said that, among
the first-level indicators, the economic social cultural index ESCS (p=0.002 < α=0.05) and the
education level of the father FEL (p=0.008 < α=0.05) are statistically significant, whereas the
mother’s education level MEL has no statistically significant effect on reading skill achievement
(p=0.994 > α=0.05). The effects of school type (basic education / general secondary education /
vocational and technical education) (p=0.003 < α=0.05) and of the region in which the school is
located (p<0.001) are also statistically significant at the α=0.05 level. There is no statistically
significant difference between the mean achievement of private and public schools (p=0.755 > α=0.05).
Similarly, the number of teachers per student has no significant effect on the mean reading skill
performance of the students (p=0.654 > α=0.05).
Analysis of the PISA 2015 Turkey results with two-level hierarchical linear modeling


Abstract

Most of the data collected in the field of education have a multi-level structure. Provinces, schools
within provinces, classes within schools and, finally, the students in the classes form a hierarchical
structure. Hierarchical linear models give more accurate results than standard models when the data
set has a hierarchical structure that starts at the individual stage and continues through
individuals, groups made up of individuals and communities made up of groups. In this study, the
effects of school-level variables on the overall achievement of 15-year-old students in reading skills
within the PISA 2015 Turkey application were analyzed with a two-level hierarchical linear model. In
addition to the socio-economic indicators of the family, the education levels of the parents and some
student-level variables related to reading skills, school-level variables such as the type of school
the student attends and the number of students and teachers in the school were included in the
modeling process, so that the hierarchical data structure between student and school was reflected in
the statistical model. As a result of the modeling, the variables that have an effect on reading
skills were identified.

Keywords: PISA 2015; reading skills; multi-level modeling; hierarchical linear model; school-level
variables


AUTHOR BIODATA 

Dr. Doğu Ataş is currently working in the Division of German Language Education at Hacettepe
University, Ankara, Turkey. He received a Ph.D. degree in German Language Education from Hacettepe
University in 2016, an M.A. degree in German Language Education from Gazi University in 2010 and a
B.A. degree in German Language Education from Hacettepe University in 2007. His interests are
professional languages, materials development and vocational education.


Dr. Özge Karadağ is currently working at the Department of Statistics, Hacettepe University, Ankara,
Turkey. She received a Ph.D. degree from the Department of Statistics at Hacettepe University in 2016,
an M.Sc. degree from the same department in 2011 and a B.Sc. degree from the same department in 2008.
Her current interests are statistical modeling, statistical genetics and genome-wide association
analysis.



